Extracting Data from IPUMS

The American Community Survey (ACS) contains demographic, geographic, and education data for households and individuals. To analyze high school graduation rates by state for the black and white populations ages 20-30, we focus on five variables: YEAR, STATEFIP, AGE, RACE, and EDUC. After creating a data extract on IPUMS (https://usa.ipums.org/usa-action/variables/group), we load the data in R using the ipumsr package and identify the codes relevant to the analysis.
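
As a rough sketch of this step, the snippet below wraps the ipumsr loading calls in a helper and shows the age and race filter on a small made-up data frame. The names `load_extract`, `keep_sample`, and `toy` are hypothetical, and the RACE codes (1 for white, 2 for black) are an assumption that should be checked against the codebook for your own extract.

```r
# Hypothetical helper: read an IPUMS extract from the path to its DDI (.xml) codebook.
# It is defined but not called here, because it needs a real extract on disk.
load_extract <- function(ddi_path) {
  ddi <- ipumsr::read_ipums_ddi(ddi_path)
  ipumsr::read_ipums_micro(ddi)
}

# Keep respondents in the age range whose RACE code is in `races`.
# Assumption: 1 = white, 2 = black; confirm against the extract's codebook.
keep_sample <- function(df, races = c(1, 2), min_age = 20, max_age = 30) {
  df[df$RACE %in% races & df$AGE >= min_age & df$AGE <= max_age, , drop = FALSE]
}

# Made-up rows standing in for the extract, for illustration only.
toy <- data.frame(
  YEAR     = 2010,
  STATEFIP = 56,
  AGE      = c(19, 20, 25, 30, 31, 27),
  RACE     = c(1, 2, 1, 2, 1, 3),
  EDUC     = c(6, 6, 5, 7, 10, 6)
)

sample_df <- keep_sample(toy)
```

With a real extract, `keep_sample(load_extract("path/to/extract.xml"))` would play the same role, where the path is a placeholder.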

Defining High School Graduation

We define high school graduation as having attained at least a high school diploma. This does not include GED attainment: a number of papers describe how individuals with GEDs have lower lifetime earning potential than high school graduates, a gap attributed to skills gained in school. Other factors certainly play a role, but for our purposes GED attainment is not counted as high school graduation.

Creating a Binary Classification for “Graduated from HS”

We define a variable that takes a value of 1 if an individual has at least finished 12th grade, and 0 otherwise. This allows us to easily find the average graduation rate by taking the mean of the binary variable.
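
A minimal sketch of the indicator and the group average, using base R on made-up data. It assumes the general EDUC code 6 corresponds to Grade 12 and that the person weight column is named PERWT; both should be verified against the extract's codebook. The names `hs_threshold`, `hs_grad`, and `rates` are introduced here for illustration.

```r
# Made-up rows standing in for the filtered extract, for illustration only.
acs <- data.frame(
  YEAR     = c(2010, 2010, 2010, 2010),
  STATEFIP = c(1, 1, 1, 1),
  RACE     = c(1, 1, 2, 2),
  EDUC     = c(6, 4, 7, 5),
  PERWT    = c(100, 300, 150, 50)
)

# Assumption: EDUC >= 6 means the respondent finished at least 12th grade.
hs_threshold <- 6
acs$hs_grad <- as.integer(acs$EDUC >= hs_threshold)

# Weighted mean of the 0/1 indicator within each year, state, and race cell.
groups <- split(acs, list(acs$YEAR, acs$STATEFIP, acs$RACE), drop = TRUE)
rates <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    YEAR      = g$YEAR[1],
    STATEFIP  = g$STATEFIP[1],
    RACE      = g$RACE[1],
    grad_rate = weighted.mean(g$hs_grad, g$PERWT)
  )
}))
```

Because the indicator is 0 or 1, its weighted mean is the weighted share of graduates in each cell, which is why the mean can be read directly as a graduation rate.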

Plotting using ggplot

We plot with ggplot2 because it offers more flexibility in mapping variables to visual encodings than the base graphics package.
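
One way such a chart could be built is sketched below: graduation rate over time, one line per race, one panel per state. The data frame and its column names (`state`, `race`, `grad_rate`) are hypothetical and the values are invented, so this is not the code behind the charts discussed here.

```r
library(ggplot2)

# Invented rates for a single placeholder state, for illustration only.
rates <- data.frame(
  YEAR      = rep(c(2008, 2009, 2010), times = 2),
  state     = "Example State",
  race      = rep(c("Black", "White"), each = 3),
  grad_rate = c(0.70, 0.72, 0.74, 0.80, 0.81, 0.83)
)

# Map year to x, rate to y, and race to colour; facet by state.
p <- ggplot(rates, aes(x = YEAR, y = grad_rate, colour = race)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ state) +
  scale_y_continuous(labels = function(x) paste0(round(100 * x), "%")) +
  labs(x = "Survey year", y = "High school graduation rate", colour = "Race")
```

Calling `print(p)` draws the chart. With all states in the data, `facet_wrap(~ state)` produces one small panel per state.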

Interesting Findings

To preface, estimates behave strangely for states where 20-to-30-year-old black people make up a small proportion of the state's population. For example, Wyoming had only 4 individuals aged 20-30 who listed themselves as black in 2010. To mitigate this, we calculate graduation rates using individual weights from the survey, though this does not entirely solve the problem. The charts below could be improved by adding the number of black and white individuals in the sample behind each average. Although the charts do not currently include these counts, we can still see some interesting trends.
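
The sample counts mentioned above could be computed along these lines. This is a sketch on made-up rows: the 4 Wyoming records mirror the count quoted in the text, the Alabama rows are invented, and the cutoff `min_n` is a hypothetical threshold rather than one used in this analysis. STATEFIP 56 is Wyoming and 1 is Alabama.

```r
# Made-up rows: 4 respondents in Wyoming (56) and 6 in Alabama (1), all RACE == 2.
acs <- data.frame(
  YEAR     = 2010,
  STATEFIP = c(56, 56, 56, 56, 1, 1, 1, 1, 1, 1),
  RACE     = 2
)

# Unweighted number of respondents in each year, state, and race cell.
counts <- aggregate(
  n ~ YEAR + STATEFIP + RACE,
  data = transform(acs, n = 1L),
  FUN  = sum
)

# Flag cells too small to support a stable rate (hypothetical cutoff).
min_n <- 5
counts$small_cell <- counts$n < min_n
```

Merging `counts` into the table of rates would let each chart show, or grey out, the estimates that rest on only a handful of respondents.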

For the most part, states with a much smaller percentage of black individuals aged 20-30 compared to white individuals (e.g. Wyoming, Maine, and the Dakotas) tend to have higher graduation rates among black people. On the other hand, states with a history of de facto segregation (e.g. Alabama, Missouri, South Carolina, and Virginia) tend to show a more consistent gap in graduation rates between black and white people, with the rate for white individuals remaining higher throughout the timeline.

One trend holds for both white and black graduation rates: a large spike around 2007, where rates jump from around 50% to over 70%. This could be related to No Child Left Behind policies affecting students who were in middle school when the law was passed, and who therefore may have gone through high school with more support.

Data Sources & Code

## Use of data from IPUMS-USA is subject to conditions including that users should
## cite the data appropriately. Use command `ipums_conditions()` for more details.